Approximate Policy Iteration for Closed-Loop Learning of Visual Tasks

Authors

  • Sébastien Jodogne
  • Cyril Briquet
  • Justus H. Piater
Abstract

Approximate Policy Iteration (API) is a reinforcement learning paradigm that can solve high-dimensional, continuous control problems. We propose to exploit API for the closed-loop learning of mappings from images to actions. This approach requires a family of function approximators that map visual percepts to real values. For this purpose, we use Regression Extra-Trees, a fast yet accurate and versatile machine learning algorithm. The inputs of the Extra-Trees consist of a set of visual features that digest the informative patterns in the visual signal. We also show how to parallelize the Extra-Tree learning process to further reduce the computational expense, which is often essential in visual tasks. Experimental results on real-world images indicate that the combination of API with Extra-Trees is a promising framework for the interactive learning of visual tasks.
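
To make the idea in the abstract concrete, here is a minimal sketch of approximate policy iteration with an ensemble of extremely randomized regression trees as the Q-function approximator. It is not the authors' algorithm or code. The five-state chain, the `step` function, the hyperparameters, and all function names are assumptions made for illustration, and a single numeric state stands in for the visual features described in the paper.

```python
import random

def sse(values):
    """Sum of squared errors around the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, n_min=2):
    """Grow one extremely randomized regression tree (simplified sketch)."""
    if max(y) == min(y):
        return y[0]                      # constant targets: exact leaf
    if len(y) < n_min:
        return sum(y) / len(y)
    best = None
    for f in range(len(X[0])):           # one random cut-point per feature
        lo = min(row[f] for row in X)
        hi = max(row[f] for row in X)
        if lo == hi:
            continue
        cut = rng.uniform(lo, hi)
        left = [i for i, row in enumerate(X) if row[f] < cut]
        right = [i for i, row in enumerate(X) if row[f] >= cut]
        if not left or not right:
            continue
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, f, cut, left, right)
    if best is None:
        return sum(y) / len(y)
    _, f, cut, left, right = best
    return (f, cut,
            build_tree([X[i] for i in left], [y[i] for i in left], rng, n_min),
            build_tree([X[i] for i in right], [y[i] for i in right], rng, n_min))

def predict(trees, x):
    """Average the leaf values of all trees; an empty ensemble predicts 0."""
    if not trees:
        return 0.0
    total = 0.0
    for node in trees:
        while isinstance(node, tuple):
            f, cut, left, right = node
            node = left if x[f] < cut else right
        total += node
    return total / len(trees)

# Hypothetical toy problem: a chain of 5 states, reward 1 for landing on state 4.
STATES, ACTIONS, GAMMA = range(5), (-1.0, 1.0), 0.9

def step(s, a):
    s2 = min(max(s + int(a), 0), 4)
    return (1.0 if s2 == 4 else 0.0), s2

def approximate_policy_iteration(n_iters=6, n_sweeps=30, n_trees=10, seed=0):
    rng = random.Random(seed)
    samples = [(s, a) + step(s, a) for s in STATES for a in ACTIONS]
    X = [[float(s), a] for s, a, _, _ in samples]
    policy = [-1.0] * len(STATES)        # start from "always move left"
    for _ in range(n_iters):
        q = []
        # Policy evaluation: repeatedly regress r + gamma * Q(s', pi(s')).
        for _ in range(n_sweeps):
            y = [r + GAMMA * predict(q, [float(s2), policy[s2]])
                 for _, _, r, s2 in samples]
            q = [build_tree(X, y, rng) for _ in range(n_trees)]
        # Policy improvement: act greedily with respect to the fitted Q.
        policy = [max(ACTIONS, key=lambda a: predict(q, [float(s), a]))
                  for s in STATES]
    return policy
```

Calling `approximate_policy_iteration()` returns one action per state. Each tree in the ensemble is grown independently of the others, which is the property that makes parallelizing the learning process, as the abstract mentions, a natural fit.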

Similar resources

Closed-Loop Learning of Visual Control Policies

In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforcement learning algorithms that are applicable to visual spaces. Inspired by the paradigm of local-ap...

Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming

Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks. In this paper we propose a new, robust dynamic programming algorithm that unifies value iter...

On Temporal Evolution in Data Streams

The future of CiteSeer: CiteSeerX; Learning to have fun; Winning the DARPA grand challenge; Challenges of urban sensing; Learning in one-shot strategic form games; A selective sampling strategy for label ranking; Combinatorial Markov random fields; Learning stochastic tree edit distance; Pertinent background knowledge for learning protein gramma...

A multi-objective model for closed-loop supply chain optimization and efficient supplier selection in a competitive environment considering quantity discount policy

Supplier selection and the allocation of optimal order quantities are two of the most important processes in closed-loop supply chains (CLSC) and reverse logistics (RL). Providing high-quality raw material is a basic requirement for a manufacturer to produce popular products and to gain market share. On the other hand, considering the existence of competitive environ...

Efficient Approximate Policy Iteration Methods for Sequential Decision Making in Reinforcement Learning

(Computer Science—Machine Learning) EFFICIENT APPROXIMATE POLICY ITERATION METHODS FOR SEQUENTIAL DECISION MAKING IN REINFORCEMENT LEARNING

Publication date: 2006